Discriminative Learning Methods for Interdependent Decision Problems in Natural Language Processing

Author

  • Hideto Kazawa
Abstract

Many natural language processing (NLP) tasks involve multiple inputs and/or outputs that are strongly correlated with each other. We call such problems “interdependent decision problems” (IDPs), since the decisions on the outputs should depend on one another. In this thesis, we focus on three typical IDPs in NLP and investigate discriminative learning methods for them. The first IDP is sentence selection, the process of selecting sentences from a document according to some criterion. Sentence selection is an IDP in the sense that whether an item is selected depends on the other items in the input set and cannot be decided from the features of that item alone. In Chapter 2, we prove that a selection problem can be converted into a classification problem. We then propose a new learning algorithm, Selection SVM, to solve successive selection problems, and report experimental results on an artificial dataset and a sentence selection dataset. The second IDP is multi-topic text categorization, the labeling process of assigning all (possibly multiple) relevant topics to a text. Multi-topic text categorization is an IDP because topics are often strongly correlated with one another. In Chapter 3, we address this problem: we propose a new learning algorithm, Maximal Margin Labeling (MML), describe efficient algorithms for MML, and report the results of testing MML on datasets of Web pages. The third IDP is sequence tagging, the process of assigning a tag from a given tag set to each word in a sequence. Sequence tagging can be called ...

∗ Doctoral Dissertation, Department of Information Processing, Graduate School of Information Science, Nara Institute of Science and Technology, NAIST-IS-DD0461007, August 24, 2006.


Similar resources

Discriminative Learning over Constrained Latent Representations

This paper proposes a general learning framework for a class of problems that require learning over latent intermediate representations. Many natural language processing (NLP) decision problems are defined over an expressive intermediate representation that is not explicit in the input, leaving the algorithm with both the task of recovering a good intermediate representation and learning to cla...


Beyond EM: Bayesian Techniques for Human Language Technology Researchers

The Expectation-Maximization (EM) algorithm has proved to be a useful technique for unsupervised learning problems in natural language, but its range of applications is largely limited by intractable E- or M-steps and by its reliance on the maximum likelihood estimator. The natural language processing community typically resorts to ad-hoc approximation methods to get (some...


A New Text Mining Method for Extracting User Context Information to Improve the Ranking of Search Engine Results

Today, the importance of text processing and its uses is well known among researchers and students. The amount of textual and documentary material increases day by day, so we need useful ways to store it and retrieve information from it. For example, search engines such as Google, Yahoo, and Bing need to read a great many web documents and retrieve those most similar to the user ...


Score Function Features for Discriminative Learning: Matrix and Tensor Frameworks

Feature learning forms the cornerstone for tackling challenging learning problems in domains such as speech, computer vision and natural language processing. In this paper, we consider a novel class of matrix and tensor-valued features, which can be pre-trained using unlabeled samples. We present efficient algorithms for extracting discriminative information, given these pre-trained features an...





Publication date: 2006